January 17, 2018

Predict wine quality from its physicochemical properties.

Wine Dataset

  • About 6,500 red and white Portuguese “Vinho Verde” wines (1,599 red and 4,898 white)
  • Features: Physicochemical properties
  • Quality assessed by blind tasting, from 0 (very bad) to 10 (excellent)

P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis. Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.

Quality Distribution

How can we solve this problem?

  • Hand-written rules
  • Statistical Model
  • Machine Learning

Programming vs. Machine Learning

Statistics vs. Machine Learning

linear \(\Rightarrow\) non-linear

additive \(\Rightarrow\) interactions

theory-driven \(\Rightarrow\) optimization-driven

Machine Learning (supervised)

Step 1: Find data

Step 2: Apply Machine Learning

Random Forest

Train Random Forest to Predict Quality

Mean absolute error on test data (cross-validated): 0.44
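A cross-validated mean absolute error averages the absolute prediction error over held-out folds, so every wine is scored by a model that never saw it during training. The sketch below shows only the evaluation loop. The toy data and the mean-predicting stand-in learner `fit_mean` are assumptions for illustration, not the random forest from the slides.

```python
def k_fold_mae(xs, ys, fit, k=5):
    """Mean absolute error over k held-out folds."""
    n = len(xs)
    abs_errors = []
    for fold in range(k):
        # Every k-th observation goes into the held-out fold.
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        abs_errors += [abs(model(xs[i]) - ys[i]) for i in test_idx]
    return sum(abs_errors) / len(abs_errors)


def fit_mean(train_x, train_y):
    # Stand-in learner (assumption): always predicts the training mean.
    mean = sum(train_y) / len(train_y)
    return lambda x: mean


# Hypothetical toy data: alcohol (vol.%) and quality scores.
alcohol = [9.0, 9.5, 10.0, 10.5, 11.0, 11.5, 12.0, 12.5, 13.0, 13.5]
quality = [5, 5, 5, 6, 6, 6, 6, 7, 7, 7]
mae = k_fold_mae(alcohol, quality, fit_mean, k=5)
```

Any learner that returns a callable model can be passed as `fit`, so the same loop would apply to a real random forest wrapper.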

Prediction vs. Actual Quality

Step 3: Profit

We want to know:

  • Which wine properties are the most predictive for quality?
  • How does a property affect the predicted wine quality?
  • Can we extract a “Rule of Thumb” from the black box?
  • Why did a wine get a certain prediction?
  • How do we have to change a wine to achieve a different prediction?

Looking inside the black box

Which features are important?

Permutation Feature Importance
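Permutation feature importance measures how much the model error grows when the values of one feature are shuffled, which breaks the link between that feature and the target. A minimal sketch follows. The `toy_model` stand-in for the black box and the synthetic rows are assumptions, chosen so that one feature matters and the other does not.

```python
import random


def mae(model, rows, targets):
    return sum(abs(model(r) - t) for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, feature, n_repeats=10, seed=0):
    """Average increase in MAE after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = mae(model, rows, targets)
    increases = []
    for _ in range(n_repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        # Replace the feature with its shuffled values, keep the rest.
        permuted = [{**r, feature: v} for r, v in zip(rows, column)]
        increases.append(mae(model, permuted, targets) - baseline)
    return sum(increases) / n_repeats


# Hypothetical stand-in for the black box: uses alcohol, ignores pH.
def toy_model(row):
    return 3.0 + 0.3 * row["alcohol"]


data_rng = random.Random(1)
rows = [{"alcohol": data_rng.uniform(8, 14), "pH": data_rng.uniform(2.9, 3.6)}
        for _ in range(200)]
targets = [toy_model(r) for r in rows]

imp_alcohol = permutation_importance(toy_model, rows, targets, "alcohol")
imp_ph = permutation_importance(toy_model, rows, targets, "pH")
```

Here shuffling `pH` leaves the error unchanged because the toy model never reads it, while shuffling `alcohol` increases the error.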

How do features affect predictions?

Accumulated Local Effects
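Accumulated local effects (ALE) split a feature into intervals, measure how the prediction changes when the feature moves from the lower to the upper edge of each interval (using only the observations that fall inside it), and then accumulate and center these local changes. The function below is a simplified first-order sketch. The quantile binning, the centering rule, and the `toy_model` are assumptions for illustration, not a reference implementation.

```python
def ale_1d(model, rows, feature, n_bins=5):
    """Simplified first-order ALE for one numeric feature."""
    values = sorted(r[feature] for r in rows)
    n = len(values)
    # Bin edges at (approximate) empirical quantiles of the feature.
    edges = [values[round(i * (n - 1) / n_bins)] for i in range(n_bins + 1)]
    accumulated, counts, total = [], [], 0.0
    for k in range(n_bins):
        lo, hi = edges[k], edges[k + 1]
        # The first bin includes its lower edge; later bins are (lo, hi].
        in_bin = [r for r in rows
                  if lo < r[feature] <= hi or (k == 0 and r[feature] == lo)]
        # Local effect: average prediction change when moving the feature
        # from the lower to the upper edge, other features left as observed.
        diffs = [model({**r, feature: hi}) - model({**r, feature: lo})
                 for r in in_bin]
        total += sum(diffs) / len(diffs) if diffs else 0.0
        accumulated.append(total)
        counts.append(len(in_bin))
    # Center so the count-weighted average effect is zero.
    mean = sum(a * c for a, c in zip(accumulated, counts)) / sum(counts)
    return edges[1:], [a - mean for a in accumulated]


# Hypothetical stand-in for the black box.
def toy_model(row):
    return 3.0 + 0.3 * row["alcohol"] - 0.5 * row["pH"]


rows = [{"alcohol": 8 + 0.1 * i, "pH": 3.0 + 0.01 * (i % 7)} for i in range(61)]
grid, ale = ale_1d(toy_model, rows, "alcohol")
```

For this linear toy model the ALE curve rises by 0.3 per unit of alcohol, matching the model's coefficient.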

Effect of Alcohol

Effect of Volatile Acidity

Rule of thumb for wine quality?

Surrogate Model

Tree explains 37.36% of black box prediction variance.
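The share of black box prediction variance explained by a surrogate is an R² computed against the black box predictions rather than the true quality scores. The sketch below fits a depth-1 regression tree to hypothetical black box predictions and reports that R². The data, the `fit_stump` helper, and the single-feature setup are assumptions for illustration.

```python
def r_squared(black_box_preds, surrogate_preds):
    """Share of black box prediction variance reproduced by the surrogate."""
    mean = sum(black_box_preds) / len(black_box_preds)
    sse = sum((b - s) ** 2 for b, s in zip(black_box_preds, surrogate_preds))
    sst = sum((b - mean) ** 2 for b in black_box_preds)
    return 1 - sse / sst


def fit_stump(xs, ys):
    """Depth-1 regression tree: the best single split on one feature."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x < threshold else right_mean


# Hypothetical black box predictions as a smooth function of alcohol.
alcohol = [8 + 0.5 * i for i in range(13)]
black_box = [3.0 + 0.3 * a + 0.02 * (a - 11) ** 2 for a in alcohol]

surrogate = fit_stump(alcohol, black_box)
r2 = r_squared(black_box, [surrogate(a) for a in alcohol])
```

An R² well below 1 signals that the simple tree misses much of what the black box does, so its rules should be read as a rough summary only.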

Explain individual predictions

Shapley Value
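A Shapley value attributes a prediction to the features by averaging each feature's marginal contribution over all orders in which features can be added. The sketch below computes exact values by enumerating every ordering, which is feasible only for a handful of features. Replacing "absent" features with values from a single baseline wine, the `toy_model`, and the example wines are all assumptions for illustration.

```python
from itertools import permutations


def shapley_values(model, instance, baseline):
    """Exact Shapley values relative to a single baseline row."""
    features = list(instance)
    contrib = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)
        prev = model(current)
        for f in order:
            # Switch feature f from its baseline value to the instance value
            # and record the resulting change in the prediction.
            current[f] = instance[f]
            new = model(current)
            contrib[f] += new - prev
            prev = new
    return {f: c / len(orderings) for f, c in contrib.items()}


# Hypothetical stand-in for the black box, with one interaction term.
def toy_model(row):
    return (3.0 + 0.3 * row["alcohol"] - 2.0 * row["volatile acidity"]
            + 0.05 * row["alcohol"] * row["sulphates"])


instance = {"alcohol": 12.5, "volatile acidity": 0.3, "sulphates": 0.8}
baseline = {"alcohol": 10.5, "volatile acidity": 0.5, "sulphates": 0.5}
phi = shapley_values(toy_model, instance, baseline)
```

The contributions sum to the difference between the prediction for the wine and the prediction for the baseline, which is what makes them readable as an additive explanation.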

Explain best wine

Explain worst wine

Improve worst wine?

Counterfactual Explanations

Improve worst wine?

How do we get the wine above a predicted quality of 5?

  • Decreasing volatile acidity to 0.2 yields predicted quality of 5.09
  • Decreasing volatile acidity to 1.0 and increasing alcohol to 13% yields predicted quality of 5.01
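Counterfactuals like the two above can be found by searching for the smallest change to a wine that pushes its prediction past the target. The sketch below does a simple grid search over one feature. The `toy_model`, the example wine, and the candidate grid are assumptions, so the resulting numbers do not correspond to those on the slide.

```python
def nearest_counterfactual(model, instance, feature, candidates, target):
    """Candidate value closest to the original that exceeds the target."""
    hits = [v for v in candidates
            if model({**instance, feature: v}) > target]
    if not hits:
        return None  # no candidate value reaches the target
    return min(hits, key=lambda v: abs(v - instance[feature]))


# Hypothetical stand-in for the black box.
def toy_model(row):
    return 6.5 - 2.0 * row["volatile acidity"] + 0.1 * (row["alcohol"] - 10)


worst_wine = {"volatile acidity": 1.2, "alcohol": 9.5}
grid = [i / 10 for i in range(1, 13)]  # candidate values 0.1, 0.2, ..., 1.2

new_va = nearest_counterfactual(toy_model, worst_wine, "volatile acidity",
                                grid, target=5.0)
new_prediction = toy_model({**worst_wine, "volatile acidity": new_va})
```

Preferring the candidate closest to the original value keeps the suggested change as small as possible; searching over several features at once would need a distance measure across features.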

Why interpretability?

Interested in learning more?

Backup

Units in Wine dataset

  • fixed acidity: g(tartaric acid)/dm3
  • volatile acidity: g(acetic acid)/dm3
  • citric acid: g/dm3
  • residual sugar: g/dm3
  • chlorides: g(sodium chloride)/dm3
  • free sulfur dioxide: mg/dm3
  • total sulfur dioxide: mg/dm3
  • density: g/cm3
  • pH
  • sulphates: g(potassium sulphate)/dm3
  • alcohol: vol.%

What tools do we have?

Interpretable Models

Interpretable Model: Linear Regression

Interpretable Model: Decision Tree

Interpretable Model: Decision Rules

IF \(90m^2\leq \text{size} < 110m^2\) AND location \(=\) “good” THEN rent is between 1540 and 1890 EUR
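A decision rule of this form is a predicate on the features paired with a prediction. As a minimal sketch, the rule above can be written as a function; the function name and the `None` return for apartments the rule does not cover are assumptions.

```python
def predict_rent_range(size_m2, location):
    """Apply the single rule; return None when the rule does not fire."""
    if 90 <= size_m2 < 110 and location == "good":
        return (1540, 1890)  # rent range in EUR
    return None  # a full rule list would need further rules or a default


covered = predict_rent_range(100, "good")
not_covered = predict_rent_range(110, "good")
```

The upper size bound is exclusive, so an apartment of exactly 110 m² falls outside this rule.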

Model-specific Methods

Layerwise Relevance Propagation (LRP)

Bach, Sebastian, et al. “On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation.” PloS one 10.7 (2015): e0130140.

Model-agnostic Methods

Model-agnostic Methods: Global Surrogate

Model-agnostic Methods: Local Surrogate

Example-focused Methods

TODO: Graphic for counterfactuals